Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 8159536 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 74091 |
| Duplicate rows (%) | 0.9% |
| Total size in memory | 684.8 MiB |
| Average record size in memory | 88.0 B |
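The two derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, using only numbers from the table and assuming 1 MiB means 2^20 bytes:

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 8_159_536
n_duplicates = 74_091
total_mib = 684.8

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes (assumption: 1 MiB = 2**20 bytes).
avg_record_bytes = total_mib * 2**20 / n_rows

print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported in the table (0.9% and 88.0 B).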
Variable types
| Text | 3 |
|---|---|
| DateTime | 1 |
| Numeric | 6 |
| Categorical | 1 |
Alerts
| Dataset has 74091 (0.9%) duplicate rows | Duplicates |
|---|---|
| category_id is highly overall correlated with department_id and 2 other fields | High correlation |
| department_id is highly overall correlated with category_id and 2 other fields | High correlation |
| parent_id is highly overall correlated with category_id and 2 other fields | High correlation |
| salesperson_id is highly overall correlated with category_id and 2 other fields | High correlation |
| quantity is highly skewed (γ1 = 41.78601045) | Skewed |
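The Skewed alert reports the skewness coefficient γ1. As a rough illustration, the population moment coefficient of skewness can be computed as below. The profiler may use a bias-corrected estimator instead, so this is a sketch of the idea rather than its exact formula, and the sample values are made up.

```python
from statistics import fmean

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (moment coefficient)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5

# Hypothetical sample: mostly 1s with one large value, loosely
# resembling a quantity column dominated by single-item sales.
sample = [1.0] * 99 + [50.0]
print(skewness(sample))
```

A symmetric sample gives a value of zero, while a long right tail such as the one above gives a large positive value.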
Reproduction
| Analysis started | 2023-12-18 09:22:57.536218 |
|---|---|
| Analysis finished | 2023-12-18 09:29:03.326039 |
| Duration | 6 minutes and 5.79 seconds |
| Software version | ydata-profiling v4.6.3 |
| Download configuration | config.json |
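A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration used for this run: the file paths and the report title are hypothetical, and the only column name taken from the report is `sales_datetime`.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so the function can be defined even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse the timestamp column so it is profiled as a date.
    df = pd.read_csv(csv_path, parse_dates=["sales_datetime"])
    report = ProfileReport(df, title="Sales profile")  # hypothetical title
    report.to_file(html_path)
    return report
```

A call such as `build_report("sales.csv", "report.html")` would then write the report; both paths are placeholders.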
transaction_id
Text
| Distinct | 3779936 |
|---|---|
| Distinct (%) | 46.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.3 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 261105152 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1938335 |
|---|---|
| Unique (%) | 23.8% |
Sample
| 1st row | 2e18343f9b9a95e89587273536e59d6e |
|---|---|
| 2nd row | d53096e90b515b563631b18acfa4d364 |
| 3rd row | d53096e90b515b563631b18acfa4d364 |
| 4th row | 2c2296658e7f9ae94954b5836214de76 |
| 5th row | 2c2296658e7f9ae94954b5836214de76 |
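The length and character statistics (always 32 characters, 16 distinct characters drawn from digits and the letters a to f) are consistent with lowercase hexadecimal identifiers. A small sketch of a format check under that assumption; the helper name `is_hex32` is illustrative only:

```python
import re

# Exactly 32 lowercase hexadecimal characters.
HEX32 = re.compile(r"[0-9a-f]{32}")

def is_hex32(value):
    """Return True if value is a 32-character lowercase hex string."""
    return HEX32.fullmatch(value) is not None

print(is_hex32("2e18343f9b9a95e89587273536e59d6e"))  # first sample row
print(is_hex32("-1"))
```

Such a check could flag identifiers that break the pattern before they are used as join keys.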
| Value | Count | Frequency (%) |
| 5ad24284bf48eee984a16d0f94a380c4 | 228 | < 0.1% |
| 9e7effb23dd7557287e6672d57cce251 | 205 | < 0.1% |
| c8ba6c7971f047170993d652aded77c9 | 189 | < 0.1% |
| fe261cabab8a92b4e56ab2abbf9440ba | 187 | < 0.1% |
| 58ece7e4171e4020f7f3774797f411fc | 171 | < 0.1% |
| 198fab4051313b11b79e399d0ced77fa | 166 | < 0.1% |
| 5d4be88addad9ae301fcbb7a917ed6dc | 144 | < 0.1% |
| a0527cd9ba6579ec4fa00494fd15a875 | 138 | < 0.1% |
| 98a033d8a20192765b4425c95ce3d8de | 138 | < 0.1% |
| 7c04514f81c60e2eb7c37c76ed5ebdf6 | 134 | < 0.1% |
| Other values (3779926) | 8157836 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 8 | 16332932 | 6.3% |
| 7 | 16332859 | 6.3% |
| 9 | 16331253 | 6.3% |
| c | 16329897 | 6.3% |
| 4 | 16329878 | 6.3% |
| e | 16325988 | 6.3% |
| 2 | 16320689 | 6.3% |
| 6 | 16319353 | 6.3% |
| 0 | 16315773 | 6.2% |
| 5 | 16314318 | 6.2% |
| Other values (6) | 97852212 | 37.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 163212499 | 62.5% |
| Lowercase Letter | 97892653 | 37.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 16332932 | |
| 7 | 16332859 | |
| 9 | 16331253 | |
| 4 | 16329878 | |
| 2 | 16320689 | |
| 6 | 16319353 | |
| 0 | 16315773 | |
| 5 | 16314318 | |
| 3 | 16308116 | |
| 1 | 16307328 |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 16329897 | |
| e | 16325988 | |
| a | 16312954 | |
| d | 16311965 | |
| f | 16306584 | |
| b | 16305265 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 163212499 | 62.5% |
| Latin | 97892653 | 37.5% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 8 | 16332932 | |
| 7 | 16332859 | |
| 9 | 16331253 | |
| 4 | 16329878 | |
| 2 | 16320689 | |
| 6 | 16319353 | |
| 0 | 16315773 | |
| 5 | 16314318 | |
| 3 | 16308116 | |
| 1 | 16307328 |
Latin
| Value | Count | Frequency (%) |
| c | 16329897 | |
| e | 16325988 | |
| a | 16312954 | |
| d | 16311965 | |
| f | 16306584 | |
| b | 16305265 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 261105152 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 8 | 16332932 | 6.3% |
| 7 | 16332859 | 6.3% |
| 9 | 16331253 | 6.3% |
| c | 16329897 | 6.3% |
| 4 | 16329878 | 6.3% |
| e | 16325988 | 6.3% |
| 2 | 16320689 | 6.3% |
| 6 | 16319353 | 6.3% |
| 0 | 16315773 | 6.2% |
| 5 | 16314318 | 6.2% |
| Other values (6) | 97852212 | 37.5% |
sales_datetime
Date
| Distinct | 1256305 |
|---|---|
| Distinct (%) | 15.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.3 MiB |
| Minimum | 2011-01-01 09:04:00 |
|---|---|
| Maximum | 2014-10-01 21:00:00 |
customer_id
Text
| Distinct | 204442 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.3 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 2 |
| Mean length | 15.403575 |
| Min length | 2 |
Characters and Unicode
| Total characters | 125686022 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 26913 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | -1 |
|---|---|
| 2nd row | -1 |
| 3rd row | -1 |
| 4th row | -1 |
| 5th row | -1 |
| Value | Count | Frequency (%) |
| -1 | 4513971 | 55.3% |
| 4acc7bfe5080965a63f05e1c3852a4ad | 8583 | 0.1% |
| ff202c04cfc8394a5e43a11012e11d93 | 8577 | 0.1% |
| 2593d7b4a54b8a3fd01144d17f1949a5 | 8573 | 0.1% |
| 3ff2a371691a832d9ba1d51fdaeac07b | 8547 | 0.1% |
| 712e4c5edc1e1e67ea67061f78678612 | 8533 | 0.1% |
| 5469d95f04a89400f2491ea0a9653dee | 8494 | 0.1% |
| 4f59ff2f20f2ad74aae3eaf596c8f978 | 8487 | 0.1% |
| d34e761991ca5b798972ec7e328253be | 8484 | 0.1% |
| f17ece610fb7cc5935a92f68fe387e0e | 8481 | 0.1% |
| Other values (204432) | 3568806 | 43.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 11986309 | 9.5% |
| a | 7465781 | 5.9% |
| 8 | 7446464 | 5.9% |
| 5 | 7411674 | 5.9% |
| f | 7392447 | 5.9% |
| d | 7344098 | 5.8% |
| c | 7300016 | 5.8% |
| 3 | 7284894 | 5.8% |
| 9 | 7248636 | 5.8% |
| 0 | 7247263 | 5.8% |
| Other values (7) | 47558440 | 37.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 77413991 | 61.6% |
| Lowercase Letter | 43758060 | 34.8% |
| Dash Punctuation | 4513971 | 3.6% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 11986309 | |
| 8 | 7446464 | |
| 5 | 7411674 | |
| 3 | 7284894 | |
| 9 | 7248636 | |
| 0 | 7247263 | |
| 2 | 7244423 | |
| 4 | 7209903 | |
| 6 | 7175653 | |
| 7 | 7158772 |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 7465781 | |
| f | 7392447 | |
| d | 7344098 | |
| c | 7300016 | |
| e | 7246027 | |
| b | 7009691 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4513971 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 81927962 | 65.2% |
| Latin | 43758060 | 34.8% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 11986309 | |
| 8 | 7446464 | |
| 5 | 7411674 | |
| 3 | 7284894 | |
| 9 | 7248636 | |
| 0 | 7247263 | |
| 2 | 7244423 | |
| 4 | 7209903 | |
| 6 | 7175653 | |
| 7 | 7158772 |
Latin
| Value | Count | Frequency (%) |
| a | 7465781 | |
| f | 7392447 | |
| d | 7344098 | |
| c | 7300016 | |
| e | 7246027 | |
| b | 7009691 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 125686022 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 11986309 | 9.5% |
| a | 7465781 | 5.9% |
| 8 | 7446464 | 5.9% |
| 5 | 7411674 | 5.9% |
| f | 7392447 | 5.9% |
| d | 7344098 | 5.8% |
| c | 7300016 | 5.8% |
| 3 | 7284894 | 5.8% |
| 9 | 7248636 | 5.8% |
| 0 | 7247263 | 5.8% |
| Other values (7) | 47558440 | 37.8% |
product_id
Text
| Distinct | 94102 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.3 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 261105152 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 10979 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 006ae35b6f0aae363ff038ffd44ad049 |
|---|---|
| 2nd row | 780024e0152928b310df607663294dd4 |
| 3rd row | 22981f293030ae132845164a0ba728e4 |
| 4th row | 6c3d6e8b176b711d066df803470efbaa |
| 5th row | 7916bb6025e4b42bbdbf48e1faacad90 |
| Value | Count | Frequency (%) |
| cd784f9c80112f207a06e23328ce0edb | 85140 | 1.0% |
| 4eb0ed65f0f049a72e10c2949d09183c | 51342 | 0.6% |
| ed92ef3044e2b4c0978a26b787f08fed | 43302 | 0.5% |
| 61c161228c0bf8c5ce7ddb6c94465971 | 33152 | 0.4% |
| c6f818a8a9fd204094af38b1b8df93e4 | 30456 | 0.4% |
| 07eaaef37bcf9fe8f8332ce94dc64652 | 30057 | 0.4% |
| 5f616d6c685f8907deaf6778821ab3d8 | 29930 | 0.4% |
| cbbcf199e72782c1e61166d3e0197603 | 27626 | 0.3% |
| 5cd8b51aee5f4501dd649e5b11f6ca8f | 22644 | 0.3% |
| b737da1bfe91eca560c62a26b8478aaf | 21536 | 0.3% |
| Other values (94092) | 7784351 | 95.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 16962593 | 6.5% |
| 0 | 16709056 | 6.4% |
| 8 | 16523573 | 6.3% |
| 4 | 16518134 | 6.3% |
| 6 | 16389893 | 6.3% |
| 2 | 16356584 | 6.3% |
| 3 | 16340057 | 6.3% |
| 7 | 16325711 | 6.3% |
| b | 16311496 | 6.2% |
| 9 | 16215004 | 6.2% |
| Other values (6) | 96453051 | 36.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 164272803 | 62.9% |
| Lowercase Letter | 96832349 | 37.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 16962593 | |
| 0 | 16709056 | |
| 8 | 16523573 | |
| 4 | 16518134 | |
| 6 | 16389893 | |
| 2 | 16356584 | |
| 3 | 16340057 | |
| 7 | 16325711 | |
| 9 | 16215004 | |
| 5 | 15932198 |
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 16311496 | |
| f | 16186917 | |
| c | 16138763 | |
| e | 16091419 | |
| a | 16077959 | |
| d | 16025795 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 164272803 | 62.9% |
| Latin | 96832349 | 37.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 16962593 | |
| 0 | 16709056 | |
| 8 | 16523573 | |
| 4 | 16518134 | |
| 6 | 16389893 | |
| 2 | 16356584 | |
| 3 | 16340057 | |
| 7 | 16325711 | |
| 9 | 16215004 | |
| 5 | 15932198 |
Latin
| Value | Count | Frequency (%) |
| b | 16311496 | |
| f | 16186917 | |
| c | 16138763 | |
| e | 16091419 | |
| a | 16077959 | |
| d | 16025795 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 261105152 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 16962593 | 6.5% |
| 0 | 16709056 | 6.4% |
| 8 | 16523573 | 6.3% |
| 4 | 16518134 | 6.3% |
| 6 | 16389893 | 6.3% |
| 2 | 16356584 | 6.3% |
| 3 | 16340057 | 6.3% |
| 7 | 16325711 | 6.3% |
| b | 16311496 | 6.2% |
| 9 | 16215004 | 6.2% |
| Other values (6) | 96453051 | 36.9% |
quantity
Real number (ℝ)
SKEWED
| Distinct | 18825 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.018986 |
| Minimum | -70 |
|---|---|
| Maximum | 189 |
| Zeros | 6001 |
| Zeros (%) | 0.1% |
| Negative | 1311 |
| Negative (%) | < 0.1% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | -70 |
|---|---|
| 5-th percentile | 0.2865 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 189 |
| Range | 259 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.60034183 |
|---|---|
| Coefficient of variation (CV) | 0.58915613 |
| Kurtosis | 6292.098 |
| Mean | 1.018986 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 41.78601 |
| Sum | 8314452.7 |
| Variance | 0.36041031 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 6904507 | 84.6% |
| 2 | 329533 | 4.0% |
| 3 | 53435 | 0.7% |
| 0.5 | 30225 | 0.4% |
| 4 | 29284 | 0.4% |
| 5 | 7667 | 0.1% |
| 6 | 6247 | 0.1% |
| 0 | 6001 | 0.1% |
| 0.2 | 4651 | 0.1% |
| 0.1 | 3266 | < 0.1% |
| Other values (18815) | 784720 | 9.6% |
| Value | Count | Frequency (%) |
| -70 | 1 | < 0.1% |
| -20 | 2 | < 0.1% |
| -15 | 1 | < 0.1% |
| -10 | 1 | < 0.1% |
| -9.712 | 1 | < 0.1% |
| -9 | 1 | < 0.1% |
| -8 | 1 | < 0.1% |
| -7 | 2 | < 0.1% |
| -6.791 | 4 | < 0.1% |
| -6.4444 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 189 | 1 | < 0.1% |
| 160 | 2 | < 0.1% |
| 156 | 1 | < 0.1% |
| 120 | 3 | < 0.1% |
| 109 | 1 | < 0.1% |
| 107.3778 | 2 | < 0.1% |
| 96 | 1 | < 0.1% |
| 91 | 1 | < 0.1% |
| 90 | 4 | < 0.1% |
| 86 | 1 | < 0.1% |
price
Real number (ℝ)
| Distinct | 11655 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.686815 |
| Minimum | -39.9 |
|---|---|
| Maximum | 2432 |
| Zeros | 60 |
| Zeros (%) | < 0.1% |
| Negative | 9 |
| Negative (%) | < 0.1% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | -39.9 |
|---|---|
| 5-th percentile | 1.9 |
| Q1 | 4.9 |
| median | 9.95 |
| Q3 | 16.5 |
| 95-th percentile | 39.5 |
| Maximum | 2432 |
| Range | 2471.9 |
| Interquartile range (IQR) | 11.6 |
Descriptive statistics
| Standard deviation | 16.202048 |
|---|---|
| Coefficient of variation (CV) | 1.1837705 |
| Kurtosis | 159.54173 |
| Mean | 13.686815 |
| Median Absolute Deviation (MAD) | 5.35 |
| Skewness | 5.8110218 |
| Sum | 1.1167806 × 10⁸ |
| Variance | 262.50636 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 237992 | 2.9% |
| 7 | 174905 | 2.1% |
| 10 | 174693 | 2.1% |
| 15 | 115466 | 1.4% |
| 12 | 113503 | 1.4% |
| 8 | 112018 | 1.4% |
| 4.5 | 111117 | 1.4% |
| 3.4 | 110601 | 1.4% |
| 4 | 100250 | 1.2% |
| 13.5 | 83216 | 1.0% |
| Other values (11645) | 6825775 | 83.7% |
| Value | Count | Frequency (%) |
| -39.9 | 1 | < 0.1% |
| -20.8 | 5 | < 0.1% |
| -11.2 | 2 | < 0.1% |
| -5.7 | 1 | < 0.1% |
| 0 | 60 | < 0.1% |
| 0.001 | 7 | < 0.1% |
| 0.0011 | 14 | < 0.1% |
| 0.0012 | 6 | < 0.1% |
| 0.0013 | 10 | < 0.1% |
| 0.0014 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| 2432 | 1 | < 0.1% |
| 1948.72 | 1 | < 0.1% |
| 1752 | 1 | < 0.1% |
| 995 | 1 | < 0.1% |
| 987.56 | 1 | < 0.1% |
| 925.9 | 2 | < 0.1% |
| 925 | 4 | < 0.1% |
| 780 | 1 | < 0.1% |
| 660 | 1 | < 0.1% |
| 600 | 1 | < 0.1% |
category_id
Real number (ℝ)
HIGH CORRELATION
| Distinct | 398 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 236.08314 |
| Minimum | 1 |
|---|---|
| Maximum | 398 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 135 |
| Q1 | 201 |
| median | 231 |
| Q3 | 266 |
| 95-th percentile | 359 |
| Maximum | 398 |
| Range | 397 |
| Interquartile range (IQR) | 65 |
Descriptive statistics
| Standard deviation | 68.375436 |
|---|---|
| Coefficient of variation (CV) | 0.2896244 |
| Kurtosis | 1.2882789 |
| Mean | 236.08314 |
| Median Absolute Deviation (MAD) | 32 |
| Skewness | -0.30599765 |
| Sum | 1.9263289 × 10⁹ |
| Variance | 4675.2003 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 220 | 805849 | 9.9% |
| 222 | 339379 | 4.2% |
| 255 | 328081 | 4.0% |
| 250 | 269489 | 3.3% |
| 235 | 217856 | 2.7% |
| 201 | 209272 | 2.6% |
| 179 | 186996 | 2.3% |
| 232 | 177660 | 2.2% |
| 183 | 163019 | 2.0% |
| 175 | 146159 | 1.8% |
| Other values (388) | 5315776 | 65.1% |
| Value | Count | Frequency (%) |
| 1 | 4 | < 0.1% |
| 2 | 5844 | 0.1% |
| 3 | 2291 | < 0.1% |
| 4 | 20 | < 0.1% |
| 5 | 120 | < 0.1% |
| 6 | 34929 | 0.4% |
| 7 | 6152 | 0.1% |
| 8 | 3 | < 0.1% |
| 9 | 766 | < 0.1% |
| 10 | 2355 | < 0.1% |
| Value | Count | Frequency (%) |
| 398 | 408 | < 0.1% |
| 397 | 5246 | 0.1% |
| 396 | 1188 | < 0.1% |
| 395 | 96 | < 0.1% |
| 394 | 103 | < 0.1% |
| 393 | 2689 | < 0.1% |
| 392 | 7387 | 0.1% |
| 391 | 3978 | < 0.1% |
| 390 | 16607 | 0.2% |
| 389 | 4443 | 0.1% |
parent_id
Real number (ℝ)
HIGH CORRELATION
| Distinct | 59 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33.820996 |
| Minimum | 1 |
|---|---|
| Maximum | 59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 17 |
| Q1 | 25 |
| median | 29 |
| Q3 | 43 |
| 95-th percentile | 59 |
| Maximum | 59 |
| Range | 58 |
| Interquartile range (IQR) | 18 |
Descriptive statistics
| Standard deviation | 12.527912 |
|---|---|
| Coefficient of variation (CV) | 0.37041817 |
| Kurtosis | -0.50480476 |
| Mean | 33.820996 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.26821554 |
| Sum | 2.7596363 × 10⁸ |
| Variance | 156.94857 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 25 | 989119 | 12.1% |
| 24 | 851940 | 10.4% |
| 29 | 765885 | 9.4% |
| 59 | 505856 | 6.2% |
| 27 | 462686 | 5.7% |
| 42 | 408439 | 5.0% |
| 41 | 377014 | 4.6% |
| 43 | 293786 | 3.6% |
| 19 | 284788 | 3.5% |
| 40 | 254015 | 3.1% |
| Other values (49) | 2966008 | 36.4% |
| Value | Count | Frequency (%) |
| 1 | 4 | < 0.1% |
| 2 | 23218 | 0.3% |
| 3 | 15095 | 0.2% |
| 4 | 59230 | 0.7% |
| 5 | 1706 | < 0.1% |
| 6 | 35382 | 0.4% |
| 7 | 7752 | 0.1% |
| 8 | 35103 | 0.4% |
| 9 | 118 | < 0.1% |
| 10 | 6602 | 0.1% |
| Value | Count | Frequency (%) |
| 59 | 505856 | 6.2% |
| 58 | 42266 | 0.5% |
| 57 | 85382 | 1.0% |
| 56 | 9 | < 0.1% |
| 55 | 400 | < 0.1% |
| 54 | 8004 | 0.1% |
| 53 | 55351 | 0.7% |
| 52 | 64264 | 0.8% |
| 51 | 43146 | 0.5% |
| 50 | 143618 | 1.8% |
store_id
Real number (ℝ)
| Distinct | 39 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.705304 |
| Minimum | 1 |
|---|---|
| Maximum | 40 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 8 |
| Q3 | 18 |
| 95-th percentile | 31 |
| Maximum | 40 |
| Range | 39 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 8.9790947 |
|---|---|
| Coefficient of variation (CV) | 0.76709624 |
| Kurtosis | 0.17346227 |
| Mean | 11.705304 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.98897472 |
| Sum | 95509850 |
| Variance | 80.624142 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 917002 | 11.2% |
| 7 | 834722 | 10.2% |
| 2 | 518525 | 6.4% |
| 3 | 473091 | 5.8% |
| 5 | 448778 | 5.5% |
| 8 | 423558 | 5.2% |
| 1 | 368196 | 4.5% |
| 22 | 336921 | 4.1% |
| 4 | 336227 | 4.1% |
| 9 | 274005 | 3.4% |
| Other values (29) | 3228511 | 39.6% |
| Value | Count | Frequency (%) |
| 1 | 368196 | 4.5% |
| 2 | 518525 | 6.4% |
| 3 | 473091 | 5.8% |
| 4 | 336227 | 4.1% |
| 5 | 448778 | 5.5% |
| 6 | 917002 | 11.2% |
| 7 | 834722 | 10.2% |
| 8 | 423558 | 5.2% |
| 9 | 274005 | 3.4% |
| 10 | 156648 | 1.9% |
| Value | Count | Frequency (%) |
| 40 | 17254 | 0.2% |
| 39 | 20717 | 0.3% |
| 38 | 13214 | 0.2% |
| 36 | 49680 | 0.6% |
| 35 | 97966 | 1.2% |
| 34 | 41619 | 0.5% |
| 33 | 52337 | 0.6% |
| 32 | 65762 | 0.8% |
| 31 | 77264 | 0.9% |
| 30 | 30425 | 0.4% |
department_id
Categorical
HIGH CORRELATION
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.3 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 8159536 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 2 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 4151760 | 50.9% |
| 3 | 2198053 | 26.9% |
| 4 | 1366211 | 16.7% |
| 1 | 443512 | 5.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 4151760 | 50.9% |
| 3 | 2198053 | 26.9% |
| 4 | 1366211 | 16.7% |
| 1 | 443512 | 5.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 8159536 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 4151760 | 50.9% |
| 3 | 2198053 | 26.9% |
| 4 | 1366211 | 16.7% |
| 1 | 443512 | 5.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8159536 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 4151760 | 50.9% |
| 3 | 2198053 | 26.9% |
| 4 | 1366211 | 16.7% |
| 1 | 443512 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8159536 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 4151760 | 50.9% |
| 3 | 2198053 | 26.9% |
| 4 | 1366211 | 16.7% |
| 1 | 443512 | 5.4% |
salesperson_id
Real number (ℝ)
HIGH CORRELATION
| Distinct | 625 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 322.54161 |
| Minimum | 1 |
|---|---|
| Maximum | 625 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 62.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 92 |
| Q1 | 165 |
| median | 368 |
| Q3 | 427 |
| 95-th percentile | 577 |
| Maximum | 625 |
| Range | 624 |
| Interquartile range (IQR) | 262 |
Descriptive statistics
| Standard deviation | 156.71991 |
|---|---|
| Coefficient of variation (CV) | 0.48589052 |
| Kurtosis | -1.1022361 |
| Mean | 322.54161 |
| Median Absolute Deviation (MAD) | 116 |
| Skewness | -0.1685921 |
| Sum | 2.6317899 × 10⁹ |
| Variance | 24561.129 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 108 | 787163 | 9.6% |
| 417 | 763096 | 9.4% |
| 428 | 206131 | 2.5% |
| 214 | 185266 | 2.3% |
| 427 | 123884 | 1.5% |
| 461 | 105869 | 1.3% |
| 584 | 99526 | 1.2% |
| 446 | 84389 | 1.0% |
| 409 | 83519 | 1.0% |
| 511 | 80368 | 1.0% |
| Other values (615) | 5640325 | 69.1% |
| Value | Count | Frequency (%) |
| 1 | 19 | < 0.1% |
| 2 | 998 | < 0.1% |
| 3 | 3988 | < 0.1% |
| 4 | 4417 | 0.1% |
| 5 | 1002 | < 0.1% |
| 6 | 1290 | < 0.1% |
| 7 | 1385 | < 0.1% |
| 8 | 2756 | < 0.1% |
| 9 | 3301 | < 0.1% |
| 10 | 4081 | 0.1% |
| Value | Count | Frequency (%) |
| 625 | 345 | < 0.1% |
| 624 | 5083 | 0.1% |
| 623 | 4376 | 0.1% |
| 622 | 6323 | 0.1% |
| 621 | 10212 | 0.1% |
| 620 | 10912 | 0.1% |
| 619 | 3779 | < 0.1% |
| 618 | 1217 | < 0.1% |
| 617 | 114 | < 0.1% |
| 616 | 4521 | 0.1% |
Correlations
| | category_id | department_id | parent_id | price | quantity | salesperson_id | store_id |
|---|---|---|---|---|---|---|---|
| category_id | 1.000 | 0.934 | 0.836 | -0.139 | -0.148 | 0.848 | -0.227 |
| department_id | 0.934 | 1.000 | 0.921 | -0.142 | -0.178 | 0.919 | -0.241 |
| parent_id | 0.836 | 0.921 | 1.000 | -0.101 | -0.130 | 0.846 | -0.228 |
| price | -0.139 | -0.142 | -0.101 | 1.000 | -0.217 | -0.128 | 0.094 |
| quantity | -0.148 | -0.178 | -0.130 | -0.217 | 1.000 | -0.162 | 0.115 |
| salesperson_id | 0.848 | 0.919 | 0.846 | -0.128 | -0.162 | 1.000 | -0.222 |
| store_id | -0.227 | -0.241 | -0.228 | 0.094 | 0.115 | -0.222 | 1.000 |
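The High correlation alerts can be checked against this matrix. The sketch below lists, for each variable, the other variables whose absolute coefficient meets a cut-off. The 0.5 threshold is an assumption for illustration and is not stated in the report; the matrix values are copied from the table.

```python
# Correlation matrix copied from the table (rows and columns in table order).
names = ["category_id", "department_id", "parent_id", "price",
         "quantity", "salesperson_id", "store_id"]
matrix = [
    [1.000, 0.934, 0.836, -0.139, -0.148, 0.848, -0.227],
    [0.934, 1.000, 0.921, -0.142, -0.178, 0.919, -0.241],
    [0.836, 0.921, 1.000, -0.101, -0.130, 0.846, -0.228],
    [-0.139, -0.142, -0.101, 1.000, -0.217, -0.128, 0.094],
    [-0.148, -0.178, -0.130, -0.217, 1.000, -0.162, 0.115],
    [0.848, 0.919, 0.846, -0.128, -0.162, 1.000, -0.222],
    [-0.227, -0.241, -0.228, 0.094, 0.115, -0.222, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables with |r| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            result[name] = partners
    return result

pairs = highly_correlated(names, matrix, THRESHOLD)
for name, partners in pairs.items():
    print(name, "->", ", ".join(partners))
```

With this cut-off, category_id, department_id, parent_id and salesperson_id each have three partners, which matches the "and 2 other fields" wording of the alerts, while price, quantity and store_id have none.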
First rows
| | transaction_id | sales_datetime | customer_id | product_id | quantity | price | category_id | parent_id | store_id | department_id | salesperson_id |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2e18343f9b9a95e89587273536e59d6e | 2011-01-01 09:04:00 | -1 | 006ae35b6f0aae363ff038ffd44ad049 | 1.0 | 13.5 | 208 | 29 | 18 | 2 | 108 |
| 1 | d53096e90b515b563631b18acfa4d364 | 2011-01-01 09:04:00 | -1 | 780024e0152928b310df607663294dd4 | 1.0 | 6.5 | 179 | 30 | 17 | 2 | 108 |
| 2 | d53096e90b515b563631b18acfa4d364 | 2011-01-01 09:04:00 | -1 | 22981f293030ae132845164a0ba728e4 | 1.0 | 6.5 | 179 | 30 | 17 | 2 | 108 |
| 3 | 2c2296658e7f9ae94954b5836214de76 | 2011-01-01 09:08:00 | -1 | 6c3d6e8b176b711d066df803470efbaa | 1.0 | 10.5 | 175 | 29 | 1 | 2 | 108 |
| 4 | 2c2296658e7f9ae94954b5836214de76 | 2011-01-01 09:08:00 | -1 | 7916bb6025e4b42bbdbf48e1faacad90 | 1.0 | 17.5 | 208 | 29 | 1 | 2 | 108 |
| 5 | d97df679d6c9d97fc9b5f96f2571f004 | 2011-01-01 09:13:00 | -1 | 11ec26ec94745367e5835a7b2826ff64 | 1.0 | 4.5 | 162 | 29 | 1 | 2 | 108 |
| 6 | fce651c756b339e8ccac5016a716abe8 | 2011-01-01 09:22:00 | -1 | 718d92d49ddbe79e535f2040ad69ba54 | 1.0 | 2.8 | 185 | 25 | 1 | 2 | 108 |
| 7 | 71a0acdaa0a56543a0f99aad44672b8b | 2011-01-01 09:34:00 | -1 | 61c161228c0bf8c5ce7ddb6c94465971 | 1.0 | 6.5 | 175 | 29 | 8 | 2 | 108 |
| 8 | 0be4cc43ca4d02734b8e3771270bdae2 | 2011-01-01 09:36:00 | -1 | ba149eec842286d64af617372fe3d1c9 | 1.0 | 5.0 | 162 | 29 | 8 | 2 | 108 |
| 9 | 95b5c2ec0badb42dc1e0204a6728ee30 | 2011-01-01 09:37:00 | -1 | d3e552cb8ef0db17c0b86b2a991ebc00 | 1.0 | 21.3 | 232 | 27 | 17 | 2 | 108 |
Last rows
| | transaction_id | sales_datetime | customer_id | product_id | quantity | price | category_id | parent_id | store_id | department_id | salesperson_id |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8159526 | 4c46be3fd071b49472f8f3888a00988f | 2014-10-01 20:50:00 | 393359353d6413f3603a0ac05c3616df | f07582f241ab4ffa2bc611c1f504f23a | 1.0 | 12.00 | 222 | 25 | 22 | 2 | 396 |
| 8159527 | 4c46be3fd071b49472f8f3888a00988f | 2014-10-01 20:50:00 | 393359353d6413f3603a0ac05c3616df | 351c168ec4e555703ecf6d13b1a22fb9 | 1.0 | 3.90 | 220 | 24 | 22 | 2 | 396 |
| 8159528 | cceb4e83f4516f7cb39d2fb6a5638cae | 2014-10-01 20:51:00 | 966a62a203014b33b22c11fcdd8b7f52 | 54b46942e898acd7b3fb1e0763ad042f | 1.0 | 9.95 | 158 | 25 | 2 | 2 | 367 |
| 8159529 | 36089a5459f6b5c228438c5557306d12 | 2014-10-01 20:54:00 | 099976a2741063dcc519d99b720d0280 | 8bb0c3e45ac5722d50d2ccca1ea92099 | 1.0 | 5.00 | 179 | 30 | 22 | 2 | 396 |
| 8159530 | 36089a5459f6b5c228438c5557306d12 | 2014-10-01 20:54:00 | 099976a2741063dcc519d99b720d0280 | 79e4745ef8f6153f56cf3af3858c4695 | 1.0 | 5.00 | 179 | 30 | 22 | 2 | 396 |
| 8159531 | 375d02cbac732e7655763366474ddb3c | 2014-10-01 20:55:00 | -1 | 221eece5c0f81c92b7f01b0e2a0573eb | 1.0 | 12.00 | 222 | 25 | 22 | 2 | 396 |
| 8159532 | 2052ddaefe02020e1794a963c8e06d8a | 2014-10-01 20:56:00 | 4d0e1076f08aa463fbd18f2553291482 | 6d2ea0ac8f7b1263cf21171522176976 | 1.0 | 12.00 | 222 | 25 | 22 | 2 | 396 |
| 8159533 | 2052ddaefe02020e1794a963c8e06d8a | 2014-10-01 20:56:00 | 4d0e1076f08aa463fbd18f2553291482 | 5285184b43c1bc18d941d2a0ab6123e4 | 1.0 | 12.00 | 161 | 27 | 22 | 2 | 396 |
| 8159534 | 67d3974eba51b09878bc0ecab84c5555 | 2014-10-01 20:57:00 | -1 | 6d2ea0ac8f7b1263cf21171522176976 | 1.0 | 12.00 | 222 | 25 | 22 | 2 | 396 |
| 8159535 | fb6dd9d66e769a2e36407b04777b8dbc | 2014-10-01 21:00:00 | -1 | 44501de44aff5f038816bba26f0b46d1 | 1.0 | 16.50 | 180 | 21 | 22 | 2 | 396 |
Most frequently occurring
| | transaction_id | sales_datetime | customer_id | product_id | quantity | price | category_id | parent_id | store_id | department_id | salesperson_id | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 62496 | d8734b9e2874373bfacf25e1c94aa532 | 2012-05-24 14:27:00 | -1 | 5d867c72bbb1a8277429673844ddb8e1 | 1.0 | 5.00 | 397 | 52 | 15 | 4 | 471 | 37 |
| 30026 | 673defde9a3efdf108d98a98c30b1e4b | 2012-11-15 16:40:00 | c0d64e6d4768df5200f0e624ab5942c4 | cd784f9c80112f207a06e23328ce0edb | 1.0 | 64.95 | 298 | 47 | 12 | 4 | 620 | 36 |
| 19970 | 447092ce5b59d5a3b8e0bdedfb664e5d | 2012-10-26 14:58:00 | c0d64e6d4768df5200f0e624ab5942c4 | cd784f9c80112f207a06e23328ce0edb | 1.0 | 64.95 | 298 | 47 | 12 | 4 | 619 | 28 |
| 72023 | f8de84f89fd708945aa82b27b8568d34 | 2013-06-03 13:28:00 | -1 | 27da5b58a4414e20f99af385ebb76f94 | 1.0 | 3.48 | 352 | 59 | 20 | 4 | 524 | 24 |
| 72349 | f9f6bd7f0d7b1fe2705f04431e55a4d4 | 2013-06-28 15:33:00 | a43d0304223cf0c2129e223b89ab577a | cd784f9c80112f207a06e23328ce0edb | 1.0 | 2.97 | 298 | 47 | 10 | 4 | 604 | 24 |
| 5458 | 12c53a6f6d738458c5ac003dda3932c7 | 2012-01-04 12:01:00 | c0d64e6d4768df5200f0e624ab5942c4 | 5d867c72bbb1a8277429673844ddb8e1 | 1.0 | 10.00 | 397 | 52 | 15 | 4 | 463 | 23 |
| 5490 | 12d8423104f43bc45e7c9c9acc0508d6 | 2012-01-27 10:56:00 | 0383d192ffc2a8fbadceef1d50d1d6b6 | cd784f9c80112f207a06e23328ce0edb | 1.0 | 4.78 | 298 | 47 | 9 | 4 | 601 | 16 |
| 12712 | 2b32f7d00671f3bc3b4743001dbb5120 | 2012-07-11 15:08:00 | -1 | cd784f9c80112f207a06e23328ce0edb | 1.0 | 2.97 | 298 | 47 | 22 | 4 | 523 | 16 |
| 14948 | 32f9c26e390b5e8c743eccaaa02c9c01 | 2013-10-04 15:49:00 | -1 | cd784f9c80112f207a06e23328ce0edb | 1.0 | 3.47 | 298 | 47 | 10 | 4 | 604 | 16 |
| 17455 | 3bad4843c9f4769e0f0d7b0e54cdd5a3 | 2012-11-22 16:54:00 | ebc1532b5f3654d8aadb91a2c68c808d | cd784f9c80112f207a06e23328ce0edb | 1.0 | 64.95 | 298 | 47 | 12 | 4 | 620 | 16 |
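A table like this one can be produced by counting identical rows and keeping those seen more than once. The sketch below uses only the standard library and a made-up miniature dataset. How the profiler itself defines its duplicate count is not stated in the report, so the function is illustrative.

```python
from collections import Counter

def repeated_rows(rows):
    """Count how often each row occurs and keep rows seen more than once."""
    counts = Counter(rows)
    return Counter({row: n for row, n in counts.items() if n > 1})

# Hypothetical miniature dataset: each row is a tuple of column values.
rows = [
    ("t1", "p1", 1.0, 5.00),
    ("t1", "p1", 1.0, 5.00),
    ("t1", "p1", 1.0, 5.00),
    ("t2", "p2", 1.0, 64.95),
    ("t3", "p3", 2.0, 3.48),
    ("t3", "p3", 2.0, 3.48),
]

dupes = repeated_rows(rows)
# Most frequently repeated rows first, as in the table.
for row, n in dupes.most_common():
    print(n, row)
```

Here `len(dupes)` gives the number of distinct rows that are repeated, and `most_common()` orders them by repeat count.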